40 research outputs found

    An introduction to crowdsourcing for language and multimedia technology research

    Language and multimedia technology research often relies on large, manually constructed datasets for training or evaluating algorithms and systems. Constructing these datasets is often expensive, and recruiting personnel to carry out the work poses significant challenges. Crowdsourcing methods, which use scalable pools of workers available on demand, offer a flexible means of rapid, low-cost construction of many of these datasets, both to support existing research requirements and potentially to promote new research initiatives that would otherwise not be possible.

    Single Tube, High Throughput Cloning of Inverted Repeat Constructs for Double-Stranded RNA Expression

    BACKGROUND: RNA interference (RNAi) has emerged as a powerful tool for the targeted knockout of genes for functional genomics, systems biology studies and drug discovery applications. To meet the requirements for high-throughput screening in plants, we have developed a new method for the rapid assembly of inverted repeat-containing constructs for the in vivo production of dsRNAs. METHODOLOGY/PRINCIPAL FINDINGS: The procedure that we describe is based on tagging the sense and antisense fragments with unique single-stranded (ss) tails, which are then assembled in a single-tube Ligase Independent Cloning (LIC) reaction. Since the assembly reaction is based on the annealing of unique complementary single-stranded tails, which can only assemble in one orientation, more than ninety percent of the resultant clones contain the desired insert. CONCLUSION/SIGNIFICANCE: Our single-tube reaction provides a highly efficient method for the assembly of inverted repeat constructs for gene suppression applications. The single-tube assembly is directional, highly efficient and readily adapted for high-throughput applications.

    Circular Polymerase Extension Cloning of Complex Gene Libraries and Pathways

    High-throughput genomics and the emerging field of synthetic biology demand ever more convenient, economical, and efficient technologies to assemble and clone genes, gene libraries and synthetic pathways. Here, we describe the development of a novel and extremely simple cloning method, circular polymerase extension cloning (CPEC). This method uses a single polymerase to assemble and clone multiple inserts with any vector in a one-step reaction in vitro. No restriction digestion, ligation, or single-stranded homologous recombination is required. In this study, we elucidate the CPEC reaction mechanism and demonstrate its use in demanding synthetic biology applications such as one-step assembly and cloning of complex combinatorial libraries and multi-component pathways.

    Situated crowdsourcing using a market model

    Research is increasingly highlighting the potential for situated crowdsourcing to overcome some crucial limitations of online crowdsourcing. However, it remains unclear whether a situated crowdsourcing market can be sustained, and whether worker supply responds to price-setting in such a market. Our work is the first to systematically investigate workers' behaviour and response to economic incentives in a situated crowdsourcing market. We show that the market-based model is a sustainable approach to recruiting workers and obtaining situated crowdsourcing contributions. We also show that the price mechanism is a very effective tool for adjusting the supply of labour in a situated crowdsourcing market. Our work advances the body of work investigating situated crowdsourcing.
    Author Keywords: Crowdsourcing; virtual currency; market; situated technologies.
    ACM Classification Keywords: H.5.m. Information interfaces and presentation (e.g., HCI)

    Random Partition Models for Microclustering Tasks

    Traditional Bayesian random partition models assume that the size of each cluster grows linearly with the number of data points. While this is appealing for some applications, this assumption is not appropriate for other tasks such as entity resolution (ER), modeling of sparse networks, and DNA sequencing tasks. Such applications require models that yield clusters whose sizes grow sublinearly with the total number of data points, a requirement known as the microclustering property. Motivated by these issues, we propose a general class of random partition models that satisfy the microclustering property with well-characterized theoretical properties. Our proposed models overcome major limitations in the existing literature on microclustering models, namely a lack of interpretability, identifiability, and full characterization of model asymptotic properties. Crucially, we drop the classical assumption of having an exchangeable sequence of data points, and instead assume an exchangeable sequence of clusters. In addition, our framework provides flexibility in terms of the prior distribution of cluster sizes, computational tractability, and applicability to a large number of microclustering tasks. We establish theoretical properties of the resulting class of priors, where we characterize the asymptotic behavior of the number of clusters and of the proportion of clusters of a given size. Our framework allows a simple and efficient Markov chain Monte Carlo algorithm to perform statistical inference. We illustrate our proposed methodology on the microclustering task of ER, where we provide a simulation study and real experiments on survey panel data.
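
    The microclustering property named in this abstract can be written more formally. What follows is a sketch based on a definition commonly used in the microclustering literature, not a formula quoted from the abstract itself. The symbols are assumed notation introduced here for illustration: $M_n$ is the size of the largest cluster in a random partition of $n$ data points, and $\varepsilon$ is an arbitrary positive tolerance.

    ```latex
    % Microclustering property (assumed notation, not taken from the abstract):
    % the largest cluster occupies a vanishing fraction of the data.
    \[
      \frac{M_n}{n} \xrightarrow{\;p\;} 0 \quad \text{as } n \to \infty,
      \qquad \text{i.e.} \qquad
      \lim_{n \to \infty} \Pr\!\left( \frac{M_n}{n} > \varepsilon \right) = 0
      \quad \text{for every } \varepsilon > 0 .
    \]
    ```

    Under the traditional assumption described in the abstract, where each cluster grows linearly with $n$, this ratio would not be expected to vanish, which is why such models are a poor fit for tasks like entity resolution.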